[00:01.180 --> 00:07.380]  All right, welcome back, everybody. Next talk up is File Encryption for Actual Humans by David
[00:07.380 --> 00:14.460]  Kane-Perry. Take it away. Thank you. All right, so welcome to File Encryption for Actual Humans.
[00:14.460 --> 00:18.720]  This is going to be a proof of concept of modern cryptography and human-centered design in about
[00:18.720 --> 00:23.560]  100 lines of Python. My name is David Kane-Perry, and while I currently work at Spotify, wrestling
[00:23.560 --> 00:28.240]  with all things cryptography and then some, this talk has nothing to do with my day job.
[00:28.240 --> 00:33.360]  This project, however, has a bit to do with the previous job. I was onboarding with the U.S.
[00:33.360 --> 00:37.660]  federal government, and background checks for public trust positions are based on the same
[00:37.660 --> 00:42.980]  criteria used to grant or deny security clearances, but are submitted by emailing forms rather than
[00:42.980 --> 00:49.440]  by web app. This here is just the first of 58 pages. The problem I faced then, and that a lot
[00:49.440 --> 00:53.940]  of people continue to face in similar situations, is that I had to share all that rather confidential
[00:53.940 --> 01:01.380]  data, and it had to be in an email. But email is fantastically insecure. I mean, the principal
[01:01.380 --> 01:05.680]  problem is that email is a store-and-forward protocol. That's just a fancy way of saying
[01:05.680 --> 01:09.540]  that there's an intermediary between the sender and the receiver who's responsible for message
[01:09.540 --> 01:14.960]  delivery. But it requires trust in the operator of the intermediary. If green and blue can't
[01:14.960 --> 01:21.060]  accept the risk of trusting red, they probably shouldn't be using email. But wait, it gets worse.
[01:21.060 --> 01:25.340]  Even if green and blue can accept the risk of trusting red, can they also do that for purple
[01:25.340 --> 01:30.900]  and for gray? Because now purple and gray are in the loop. Because email may hop across
[01:30.900 --> 01:35.360]  multiple systems before it reaches its destination, the operator of each system needs to be trusted
[01:36.000 --> 01:42.600]  to neither read nor modify green's email to blue. Oh, but the list of intermediaries can't be
[01:42.600 --> 01:48.540]  known in advance. So even if you have transport security between green and blue, each hop decrypts
[01:48.540 --> 01:52.880]  green's email and then re-encrypts it when forwarding it on. None of the trust issues with
[01:52.880 --> 01:58.880]  red, purple, and gray are addressed, only potentially the risk of already unauthorized access by an
[01:58.880 --> 02:03.600]  attacker in a privileged network position. Aha, you might be thinking to yourself, I can solve this
[02:03.600 --> 02:09.020]  problem by sharing the file through a hosting service. Well, bad news. That model still collapses
[02:09.020 --> 02:13.800]  to sending an email. The email that green sends to blue with the link to the file is as insecure
[02:13.800 --> 02:18.500]  as if green had included the file in the email itself. There are no shortcuts to turning an
[02:18.500 --> 02:24.140]  insecure channel into a secure one. Speaking of trying to turn an insecure channel into a secure
[02:24.140 --> 02:30.580]  one, a quick digression on GPG. It's worse than email. Unlike GPG, email is something that people
[02:30.580 --> 02:35.200]  want to use. For both the US federal government and the less technically sophisticated members
[02:35.200 --> 02:38.800]  of your family, perhaps, it's what they're comfortable with. And for different reasons,
[02:38.800 --> 02:42.760]  they're both highly averse to installing any third-party software. Even if you could help
[02:42.760 --> 02:47.900]  them out of their comfort zone, tools like GPG have proven over and over to be unworthy of our
[02:47.900 --> 02:53.720]  trust. So here's the threat model me and many others are dealing with, but it's okay if it
[02:53.720 --> 02:59.180]  doesn't match yours. First, it's got to be an email. No Signal or whatever other tool we would
[02:59.180 --> 03:05.580]  prefer to use. Second, no file hosting URLs. It's just as insecure. And besides, with what I assume
[03:05.580 --> 03:09.760]  are the best of intentions, folks have been trying to train others to simply not click on links in
[03:09.760 --> 03:16.000]  email. To borrow the southern US phrase, bless their hearts. Also, no third-party software. The
[03:16.000 --> 03:20.060]  US federal government has many barriers in place to prevent this, and they're not about to lower
[03:20.060 --> 03:25.040]  them just for me. For your less sophisticated family members, even if they wanted to, could they
[03:25.040 --> 03:29.760]  even manage it without your in-person assistance? And at that point, you no longer need to send them
[03:29.760 --> 03:35.340]  an email. As highlighted in the file hosting proposition, it would be pointless to send both
[03:35.340 --> 03:40.460]  the encrypted file and the key via email. While the plural of insecure channels is not synonymous
[03:40.460 --> 03:45.960]  with a secure channel, it may acceptably lower the risk. Unknown entities snooping my email?
[03:45.960 --> 03:51.520]  Unacceptable. Unknown entities snooping both my email and my phone? That could be highly probable
[03:51.520 --> 03:57.760]  for some, but my self-assessment leans towards low risk for me. So what tools are available by default?
[03:58.480 --> 04:05.620]  Well, PKZIP is just everywhere. Windows, macOS, Linux, choose your BSD. You name it, they probably
[04:05.620 --> 04:10.080]  ship a zip tool with the installation media. But because of licensing issues, they won't be
[04:10.080 --> 04:15.460]  shipping the zip format that uses the AES cipher. Oh no, they'll be using the homegrown PKZIP cipher.
[04:16.120 --> 04:24.580]  And oh boy, PKZIP is so broken. Hashtag, never roll your own crypto. Shocking, right? From an attacker's
[04:24.580 --> 04:29.860]  perspective, having foreknowledge of 13 consecutive plaintext bytes may not be any burden at all,
[04:29.860 --> 04:35.040]  and they'll have that file cracked in a couple hours. Or as was demonstrated just the other day,
[04:35.040 --> 04:38.700]  under other circumstances, the attacker doesn't even need to have foreknowledge of any plaintext
[04:38.700 --> 04:44.020]  bytes, and they'll still be able to crack it in a couple hours. Even against less sophisticated
[04:44.020 --> 04:47.940]  attackers who won't try and break the cipher, but instead will just try and crack the password,
[04:47.940 --> 04:51.680]  its lack of resistance combined with the human imperfection of password choices means that
[04:51.680 --> 04:55.940]  odds are they'll win in a reasonable amount of time with a modest amount of resources.
[04:56.180 --> 05:02.060]  Fun fact, a potential employer once sent me a zip encrypted file the day before they would
[05:02.060 --> 05:06.560]  interview me about the code inside. With nothing better to do at the time, I ran a cracker on it,
[05:06.560 --> 05:11.960]  and was frankly disappointed to quickly discover the password was the name of the company's product.
[05:12.600 --> 05:16.180]  And it's those kinds of human-computer interaction challenges that really undermine password-based
[05:16.180 --> 05:20.260]  security. Here's a quote by Dave Kearns paraphrasing something Winston Churchill said
[05:20.260 --> 05:25.060]  about democracy. Passwords are the worst form of authentication except for all those other
[05:25.060 --> 05:30.740]  methods that have been tried from time to time. 500 years. That's how long it would take to get
[05:30.740 --> 05:35.940]  at least 50% probability of successfully guessing a cryptographically random 10-character U.S.
[05:35.940 --> 05:41.120]  keyboard-based password, assuming a guess rate of 47 quadrillion passwords a year.
[05:41.300 --> 05:44.840]  But it means humans having to deal with credentials that look like this.
[05:44.840 --> 05:48.960]  This isn't something I could read over the phone to someone or even put in a text message.
[05:48.960 --> 05:54.380]  That character next to the Q, is that a capital I or an L or the pipe character?
[05:54.400 --> 05:59.100]  These are good for computers, but they're bad for people. 700 years. That's how long it would
[05:59.100 --> 06:03.520]  take to get at least 50% probability of successfully guessing a cryptographically random
[06:03.520 --> 06:08.580]  six-word English-based passphrase. Again, assuming a guess rate of 47 quadrillion
[06:08.580 --> 06:13.380]  passphrases a year. And instead, we're dealing with strings that look like this.
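A rough sketch of the arithmetic behind the two figures above, offered as an illustration rather than the speaker's own calculation. The talk gives only the guess rate (47 quadrillion per year); the alphabet size of 94 printable characters and the word-list size of 2048 are assumptions, chosen because they land close to the quoted 500 and 700 years.

```python
# Guess rate quoted in the talk: 47 quadrillion guesses per year.
GUESSES_PER_YEAR = 47e15


def years_to_even_odds(symbols, length, guesses_per_year=GUESSES_PER_YEAR):
    """Years of guessing needed to cover half of the search space,
    i.e. to reach a 50% chance of having hit a uniformly random secret."""
    search_space = symbols ** length
    return (search_space / 2) / guesses_per_year


# Assumption: 94 printable, non-space characters on a U.S. keyboard.
password_years = years_to_even_odds(symbols=94, length=10)

# Assumption: a 2048-word list (11 bits per word), six words.
passphrase_years = years_to_even_odds(symbols=2048, length=6)

print(f"10-character password: about {password_years:.0f} years")
print(f"6-word passphrase:     about {passphrase_years:.0f} years")
```

Both results depend entirely on the secret being chosen uniformly at random; a larger word list or alphabet would push the numbers up.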
[06:13.380 --> 06:17.380]  This is something I could read over the phone or put in a text message and have very little
[06:17.380 --> 06:21.660]  risk of misinterpretation by the other party, necessitating rather frustrating error correction
[06:21.660 --> 06:27.680]  protocols just to open my file. But all those confidence-inspiring numbers are only true
[06:27.680 --> 06:31.440]  when it's the computer that generates the credential instead of the human. Humans are
[06:31.440 --> 06:35.580]  really bad at pretending to be computers and consistently fail the reverse Turing tests
[06:35.580 --> 06:40.120]  that are thrown at them. Like generating random strings. Computers are great at this,
[06:40.120 --> 06:44.620]  but we keep asking people to do this, despite overwhelming evidence that they simply can't.
[06:44.620 --> 06:50.220]  If they could, we wouldn't be able to compile the top 100 common passwords, but we can,
[06:50.220 --> 06:55.100]  because they can't. Even if they could, password policies ensure that cryptographically random
[06:55.100 --> 07:00.380]  strings won't be accepted. If there's a policy that says, must include one each of uppercase,
[07:00.380 --> 07:04.580]  lowercase, numeric, and special characters, all that's been accomplished is that the
[07:04.580 --> 07:08.180]  attacker's search space has been reduced.
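To make the search-space point concrete, here is a small sketch that counts how many strings survive a "one of each class" policy, using inclusion-exclusion. The class sizes (26 uppercase, 26 lowercase, 10 digits, 32 specials) are an assumption about a U.S. keyboard and are not figures from the talk.

```python
from itertools import combinations

# Assumed character classes; together they make up 94 printable characters.
CLASS_SIZES = {"upper": 26, "lower": 26, "digit": 10, "special": 32}


def total_count(length):
    """All strings of the given length over the full alphabet."""
    return sum(CLASS_SIZES.values()) ** length


def compliant_count(length):
    """Strings containing at least one character from every class.

    Inclusion-exclusion: start from all strings, subtract those missing
    one class, add back those missing two classes, and so on.
    """
    alphabet = sum(CLASS_SIZES.values())
    sizes = list(CLASS_SIZES.values())
    count = 0
    for k in range(len(sizes) + 1):
        for excluded in combinations(sizes, k):
            count += (-1) ** k * (alphabet - sum(excluded)) ** length
    return count


if __name__ == "__main__":
    n = 10
    share = compliant_count(n) / total_count(n)
    print(f"Share of {n}-character strings the policy still allows: {share:.2%}")
```

Every string the policy rejects is one an attacker no longer has to try, and the rejected set includes many strings a random generator would otherwise have produced.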
[07:08.200 --> 07:12.500]  Another fun fact, at a previous job, we tried to crack the employee passwords that had a similar
[07:12.500 --> 07:17.100]  policy in place. The most popular cracked password, the company name, followed by the
[07:17.100 --> 07:21.200]  current year, followed by an exclamation point. It was easy, then, for us to predict what we'd
[07:21.200 --> 07:27.600]  find the following year. Of course, generating one is just half the challenge. Do you remember
[07:27.600 --> 07:31.760]  the 10-character password I showed a few slides back? If so, this might be good news or bad news,
[07:31.760 --> 07:35.820]  depending on which side you want to take in the robot apocalypse, but you might be an android.
[07:35.940 --> 07:40.660]  The rest of us meatbags have a really hard time remembering stuff all the time, especially when
[07:40.660 --> 07:45.520]  there's no pattern for us to match on. So one of the tricks we've learned is that we use those
[07:45.520 --> 07:49.520]  little trouble tokens everywhere we possibly can. Folks will figure out their quote-unquote
[07:49.520 --> 07:54.940]  one good password and reuse that bugger everywhere. But inevitably, someone will store that
[07:54.940 --> 07:59.160]  password insecurely, and now all their accounts are compromised across all the services,
[07:59.160 --> 08:03.520]  and they have to come up with a new one good password, until the next compromise.
[08:04.480 --> 08:09.180]  So let's stop asking humans to be computers and embrace what computers are better at
[08:09.180 --> 08:14.060]  than them. To that end, I wrote a proof-of-concept tool to demonstrate what a file encryption tool
[08:14.060 --> 08:17.920]  that leverages modern cryptography and human-centered design could look like.
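A minimal sketch of the human-facing half of such a tool. It is not the speaker's code, and every name in it is hypothetical. It illustrates three ideas from the talk: the computer picks the passphrase, a memory-hard function stretches it into a key, and the file name alone decides between encrypting and decrypting. Two substitutions are assumptions made to keep the example within the Python standard library: `hashlib.scrypt` stands in for Argon2, and the encryption step itself (AES-SIV in the talk) is left out. The `.xc` suffix is taken from the demo as heard in this transcript, and the word list is a tiny placeholder.

```python
import hashlib
import secrets

# Placeholder word list for illustration only; a real tool would ship
# a much larger list (thousands of words).
DEMO_WORDS = ["anchor", "basket", "candle", "dragon",
              "ember", "falcon", "garden", "harbor"]

SUFFIX = ".xc"  # assumed spelling of the suffix mentioned in the demo


def generate_passphrase(words=DEMO_WORDS, count=4):
    """Pick `count` words uniformly at random using the OS-backed CSPRNG."""
    return " ".join(secrets.choice(words) for _ in range(count))


def derive_key(passphrase, salt, length=64):
    """Stretch a passphrase into key bytes with scrypt (a stand-in for Argon2).

    The cost parameters are illustrative, not tuned recommendations.
    """
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2**14, r=8, p=1,
        maxmem=64 * 1024 * 1024,
        dklen=length,
    )


def choose_mode(filename):
    """Decide what to do from the file name alone, so no flags are needed."""
    return "decrypt" if filename.endswith(SUFFIX) else "encrypt"


if __name__ == "__main__":
    phrase = generate_passphrase()
    salt = secrets.token_bytes(16)
    key = derive_key(phrase, salt)
    print("passphrase:", phrase)
    print("key bytes:", len(key))
    print(choose_mode("forms.pdf"), choose_mode("forms.pdf" + SUFFIX))
```

The 64-byte output is chosen because AES-SIV with 256-bit AES takes a double-length key; the random salt would need to be stored alongside the ciphertext so the recipient can derive the same key.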
[08:18.840 --> 08:25.140]  First, replace the PKZIP cipher with AES-SIV. AES-SIV could take an entire talk on its own,
[08:25.140 --> 08:31.320]  but what's most important is its resistance to nonce reuse and misuse. Even with AES-GCM,
[08:31.320 --> 08:36.900]  if nonces are reused, failure can be catastrophic. With AES-SIV, if nonces are reused,
[08:36.900 --> 08:40.740]  the impact is limited to an attacker observing that the same unknown message was encrypted more
[08:40.740 --> 08:46.680]  than once. Also, replace the CRC-based key derivation function with Argon2. Again,
[08:46.680 --> 08:50.680]  Argon2 is deserving of a talk all to itself, but what's important here is the resistance
[08:50.680 --> 08:56.020]  Argon2 provides against GPU-based cracking. The parameters of Argon2 can be tuned as one sees fit
[08:56.020 --> 08:59.880]  to force the attacker to make difficult trade-off decisions when it comes to rate of guessing
[08:59.880 --> 09:12.230]  and the cost of each guess. And finally, replace asking the human to choose a password and just
[09:12.230 --> 09:18.270]  choose one for them at random. You may be familiar with the popular XKCD comic strip that demonstrates
[09:18.270 --> 09:23.070]  the superiority of a password like correct horse battery staple to troubadour with leet speak
[09:23.070 --> 09:27.630]  characters. What seems to be often misunderstood about that demonstration, though, is the necessity
[09:27.630 --> 09:33.190]  of the words to be random. And so, consequently, some services have started to reject correct horse
[09:33.190 --> 09:37.190]  battery staple as a password because so many people have decided it would be their new one
[09:37.190 --> 09:42.730]  good password. And here, I've chosen four words rather than six because Argon2 enables us to
[09:42.730 --> 09:47.410]  reduce the total number of guesses per year an attacker could make. And so, I can shrink the
[09:47.410 --> 09:54.770]  search space for greater human usability. Okay. So, demo time. But keep in mind, this is supposed
[09:54.770 --> 10:00.030]  to be boring. We're trying to solve real problems for the average person, so complexity is the enemy
[10:00.030 --> 10:05.130]  of adoption. And given the technical difficulties that have and could arise, I decided the most
[10:05.130 --> 10:09.510]  reliable way to demonstrate this was to just take a screenshot of my terminal. On the first line,
[10:09.510 --> 10:13.910]  I'm encrypting a file because it doesn't end in the magic .xc suffix it's looking for. There's no
[10:13.910 --> 10:18.510]  need to pass in any other command-line arguments because there aren't any. I see the passphrase
[10:18.510 --> 10:22.290]  that was generated for me and the name of the encrypted file. And now I could email that file
[10:22.290 --> 10:26.570]  to someone and call them with the passphrase and they would follow the other lines. The file
[10:26.570 --> 10:32.210]  is opened for decryption because it ends in the magic .xc suffix and the user's prompted for the
[10:32.210 --> 10:35.510]  passphrase. Once it's supplied, the file is decrypted and the name of the decrypted file
[10:35.510 --> 10:39.390]  is displayed for the user. The lines after that demonstrate this proof of concept works for
[10:39.390 --> 10:46.110]  directory trees as well. But this is just a proof of concept and should not really be used anywhere
[10:46.110 --> 10:53.770]  ever. So why did I do this then? Well, we need the Microsofts and the Apples of the world
[10:53.770 --> 10:57.890]  to accept responsibility for the broken cryptography and the human hostile design
[10:57.890 --> 11:03.010]  they ship by default to all their customers who are not security experts and trust that the tools
[11:03.010 --> 11:07.370]  that they provide are up to the task at hand. Remember that folks generally don't want third
[11:07.370 --> 11:12.730]  party software when the first party one seems good enough. That was after all the basis for
[11:12.730 --> 11:18.650]  the antitrust suit against Microsoft. So governments may be in a position to drive change here. At
[11:18.650 --> 11:22.150]  least in the U.S., the National Institute of Standards and Technology has been tasked by
[11:22.150 --> 11:27.050]  Senator Ron Wyden to look into the insufficiencies of the tools that are being provided by default
[11:27.050 --> 11:32.630]  and perhaps to find a new sufficient standard. I eagerly await any progress they make here.
[11:32.630 --> 11:36.310]  But if this is a problem space that interests you, you don't need to wait. There are a couple
[11:36.310 --> 11:40.030]  of other projects that are not just proofs of concept but are actually stable and interested
[11:40.030 --> 11:44.810]  in real-world users. Magic Wormhole is an interesting human-centered approach to sharing
[11:44.810 --> 11:50.250]  files directly over the network instead of email. And age implements modern cryptography for
[11:50.250 --> 11:54.910]  file encryption even if the user experience is not really intended for technical novices.
[11:55.790 --> 11:59.850]  So thank you for your time and if you have any questions or comments for me, I've provided a few
[11:59.850 --> 12:03.630]  links here. The first one is a post I wrote that explores some of the issues here in perhaps a
[12:03.630 --> 12:07.950]  bit more depth, with links for all of my references. If you just want to
[12:07.950 --> 12:11.530]  play around with the code, you can find it there on GitHub. And if you'd still like to reach out,
[12:11.530 --> 12:18.860]  my DMs are always open on Twitter. And thanks again. Thank you so much for joining us today,
[12:18.860 --> 12:29.560]  David. That was very interesting. See, I don't see too many comments in the chat, but viewers,
[12:29.560 --> 12:47.700]  if you have any questions, put them in the Discord Q&A chat. Let's see. And this is
[12:47.700 --> 12:59.330]  the last talk of the day. Yay! Do you have any thoughts on password expiration?
[13:00.130 --> 13:06.490]  Do I have any thoughts on password expiration? I think that a lot of organizations have got
[13:06.490 --> 13:12.690]  themselves into trouble by making them time-based rather than event-based. Certainly when there's
[13:12.690 --> 13:19.070]  evidence of compromise, that would be a good driver for doing a rotation across one or more
[13:19.070 --> 13:25.770]  of the potentially at-risk users in scope. But when there's policies in place that force people
[13:25.770 --> 13:31.630]  to, as I was forced to during my time with the US federal government, change their password every 90
[13:31.630 --> 13:35.590]  days, like every 90 days, everyone was complaining about how they had to pick a new password,
[13:35.590 --> 13:39.690]  even though there was no evidence that anyone had been compromised in any way whatsoever.
[13:39.690 --> 13:46.930]  Mm-hmm. Thoughts on minimum password length requirements?
[13:48.590 --> 13:54.070]  That's an interesting question. I mean, that kind of gets to one of the things I talk about
[13:54.070 --> 14:01.390]  in the post that I linked to, that past a certain point, you're expecting users to
[14:01.390 --> 14:06.190]  adopt and be comfortable using a password manager, right? We're still more or less
[14:06.190 --> 14:09.570]  comfortable to like, okay, 10 characters is something that someone can remember without
[14:09.570 --> 14:15.390]  software to help them. But once we start talking about 12 and 14 and beyond that,
[14:15.390 --> 14:20.710]  you're stretching the limits of what humans were capable of in the first place. And so I think
[14:20.710 --> 14:26.630]  the sooner organizations can migrate towards supporting their users to get comfortable
[14:26.630 --> 14:31.970]  with password managers, or alternatively adopting security keys, then questions about minimum
[14:31.970 --> 14:37.090]  password length become less important in terms of human-computer interaction questions.
[14:37.910 --> 14:42.770]  Yeah, that's great. And there was another question on password managers, but it sounds like you just
[14:42.770 --> 14:49.210]  touched it right there. Cool. And then do you feel like Microsoft is already solving the password
[14:49.210 --> 14:54.870]  issue with passwordless logins? And do you see that gaining widespread adoption in other applications
[14:54.870 --> 15:01.210]  such as cryptography? I think they're certainly solving one part of the problem.
[15:01.210 --> 15:06.230]  They're solving kind of the online service problem. One of the things that I touch on
[15:06.230 --> 15:11.810]  in my post is that this kind of approach to generating the password for the user one time
[15:11.810 --> 15:15.750]  and just showing it to the screen, that's not going to work for a service that you need to
[15:15.750 --> 15:20.870]  interact with multiple times. And so that's, again, where password managers and security
[15:20.870 --> 15:29.970]  keys come back into the scope of things. In this limited case, passwordless authentication
[15:31.350 --> 15:34.990]  isn't going to help with the problem where I have a file I need to email to someone
[15:35.790 --> 15:40.490]  because then without any sort of mechanism for doing key exchange between them,
[15:40.970 --> 15:47.110]  there's no way for me to do any encryption on the file that could be decrypted by the other user.
[15:48.090 --> 15:56.470]  Yeah. All right. Thanks for the talk. Very cool stuff. Do you think it would be better to pursue
[15:56.470 --> 16:01.990]  making the implementations of existing technologies like PGP more user-friendly than
[16:01.990 --> 16:06.930]  trying to build something new? Or do you feel like things are too difficult to make user-friendly as
[16:06.930 --> 16:15.590]  it stands now? That's a hard question. In general, my preference is always to help improve existing
[16:15.590 --> 16:23.050]  tools rather than to start from scratch. But as someone who has not waded in as deeply as, say,
[16:23.050 --> 16:30.450]  the folks who have built Magic Wormhole and age, I trust that they have tried to do their best.
[16:30.550 --> 16:35.010]  And that in their final judgment, in order for them to do what they felt was necessary,
[16:35.010 --> 16:43.150]  they needed to build something from scratch. And I think there's nothing preventing projects from
[16:43.150 --> 16:51.240]  cross-pollination of ideas, the possibility to reuse code. So yeah, there's no one-size-fits-all.
[16:53.650 --> 16:57.650]  All right, well thank you for joining us today. I know there's a few more
[16:57.650 --> 17:01.950]  questions that I might have missed in the chat. Hopefully we can get David in there
[17:01.950 --> 17:09.530]  in Discord to answer any more that you have. Yep, I'll be there. Thank you for joining us. Thank you.
